High-Performance XML Twig Filtering using GPUs

Authors

  • Ildar Absalyamov
  • Roger Moussalli
  • Vassilis J. Tsotras
  • Walid A. Najjar
Abstract

The current state of the art in information dissemination consists of publishers broadcasting XML-coded documents, which are in turn selectively forwarded to interested subscribers. Placing XML at the heart of this setup greatly increases the expressive power of the profiles listed by subscribers, which are written in the XPath language. On the other hand, with great expressive power comes great performance responsibility: it is becoming harder for the matching infrastructure to keep up with the high volumes of data and users. Traditionally, general-purpose computing platforms have been favored over customized computational setups because they are simpler to use and significantly reduce development time. The sequential nature of these general-purpose computers, however, limits their performance scalability. In this work, we propose implementing the filtering infrastructure on massively parallel Graphics Processing Units (GPUs). We consider the holistic (no post-processing) evaluation of thousands of complex twig-style XPath queries in a streaming (single-pass) fashion, resulting in a speedup over CPUs of up to 9x in the single-document case and up to 4x for large batches of documents. A thorough set of experiments is provided, detailing the effects of several factors on the CPU and GPU filtering platforms.

Similar resources

High-Performance Holistic XML Twig Filtering Using GPUs

The current state of the art in information dissemination consists of publishers broadcasting XML-coded documents, which are in turn selectively forwarded to interested subscribers. Placing XML at the heart of this setup greatly increases the expressive power of the profiles listed by subscribers, which are written in the XPath language. On the other hand, with great expressive power comes great performance resp...

Value-based predicate filtering of XML documents

In recent years, publish–subscribe systems based on XML filtering have received much attention in ubiquitous computing environments and Internet applications. The main challenge is to process a large volume of content against millions of user subscriptions. Several XML filtering systems focus on the efficient processing of structural matching of user subscriptions represented as XPath twig patt...

The Space Complexity of Processing XML Twig Queries Over Indexed Documents (Irwin and Joan Jacobs Center for Communication and Information Technologies)

Current twig join algorithms incur high memory costs on queries that involve child-axis nodes. In this paper we provide an analytical explanation for this phenomenon. In a first large-scale study of the space complexity of evaluating XPath queries over indexed XML documents we show the space to depend on three factors: (1) whether the query is a path or a tree; (2) the types of axes occurring i...

Efficient and Scalable Sequence-Based XML Filtering

The ubiquitous adoption of XML as the standard of data exchange over the web has led to increased interest in building efficient and scalable XML publish-subscribe (pub-sub) systems. The central function of an XML-based pub-sub system is to perform XML filtering efficiently, i.e. identify those XPath expressions that have a match in a streaming XML document. In this paper, we propose a new sequ...

Answering XML Twig Queries with Automata

XML is emerging as a de facto standard for information representation and data exchange over the web. Evaluation of twig queries, which allow users to find all occurrences of a multi-branch pattern in an XML database, is a core and complicated operation in XML query processing. The performance of conventional evaluation approaches based on structural joins declines with the expansion of data size...

Publication date: 2013